NSF PAR Search | NSF Public Access Repository


Matching the Statistical Query Lower Bound for k-Sparse Parity Problems with Sign Stochastic Gradient Descent

Kou, Yiwen; Chen, Zixiang; Gu, Quanquan; Kakade, Sham M (December 2024, Advances in Neural Information Processing Systems (NeurIPS))

Full Text Available
Scaling Laws in Linear Regression: Compute, Parameters, and Data

Lin, Licong; Wu, Jingfeng; Kakade, Sham M; Bartlett, Peter L; Lee, Jason D (December 2024, Advances in neural information processing systems)

Full Text Available
The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift

Wu, Jingfeng; Zou, Difan; Braverman, Vladimir; Gu, Quanquan; Kakade, Sham M. (January 2022, Advances in neural information processing systems)

Full Text Available
Risk Bounds of Multi-Pass SGD for Least Squares in the Interpolation Regime

Zou, Difan; Wu, Jingfeng; Braverman, Vladimir; Gu, Quanquan; Kakade, Sham M. (January 2022, Advances in neural information processing systems)

Full Text Available
Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression

Wu, Jingfeng; Zou, Difan; Braverman, Vladimir; Gu, Quanquan; Kakade, Sham M. (January 2022, International Conference on Machine Learning)

Full Text Available
What are the Statistical Limits of Offline RL with Linear Function Approximation?

Wang, Ruosong; Foster, Dean; Kakade, Sham M. (January 2021, International Conference on Learning Representations)

Offline reinforcement learning seeks to utilize offline (observational) data to guide the learning of (causal) sequential decision making strategies. The hope is that offline reinforcement learning coupled with function approximation methods (to deal with the curse of dimensionality) can provide a means to help alleviate the excessive sample complexity burden in modern sequential decision making problems. However, the extent to which this broader approach can be effective is not well understood, where the literature largely consists of sufficient conditions. This work focuses on the basic question of what are necessary representational and distributional conditions that permit provable sample-efficient offline reinforcement learning. Perhaps surprisingly, our main result shows that even if: 1) we have realizability in that the true value function of every policy is linear in a given set of features and 2) our off-policy data has good coverage over all features (under a strong spectral condition), any algorithm still (information-theoretically) requires a number of offline samples that is exponential in the problem horizon to non-trivially estimate the value of any given policy. Our results highlight that sample-efficient offline policy evaluation is not possible unless significantly stronger conditions hold; such conditions include either having low distribution shift (where the offline data distribution is close to the distribution of the policy to be evaluated) or significantly stronger representational conditions (beyond realizability).
Full Text Available
Benign Overfitting of Constant-Stepsize SGD for Linear Regression

Zou, Difan; Wu, Jingfeng; Braverman, Vladimir; Gu, Quanquan; Kakade, Sham M. (January 2021, Annual Conference on Learning Theory (COLT))

Full Text Available
The Benefits of Implicit Regularization from SGD in Least Squares Problems

Zou, Difan; Wu, Jingfeng; Braverman, Vladimir; Gu, Quanquan; Foster, Dean P.; Kakade, Sham M. (January 2021, Advances in neural information processing systems)

Full Text Available
The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares

Ge, Rong; Kakade, Sham M.; Kidambi, Rahul; Netrapalli, Praneeth (December 2019, NeurIPS 2019)

Minimax optimal convergence rates for numerous classes of stochastic convex optimization problems are well characterized, where the majority of results utilize iterate averaged stochastic gradient descent (SGD) with polynomially decaying step sizes. In contrast, the behavior of SGD's final iterate has received much less attention despite its widespread use in practice. Motivated by this observation, this work provides a detailed study of the following question: what rate is achievable using the final iterate of SGD for the streaming least squares regression problem with and without strong convexity? First, this work shows that even if the time horizon T (i.e. the number of iterations that SGD is run for) is known in advance, the behavior of SGD's final iterate with any polynomially decaying learning rate scheme is highly suboptimal compared to the statistical minimax rate (by a condition number factor in the strongly convex case and a factor of \sqrt{T} in the non-strongly convex case). In contrast, this paper shows that Step Decay schedules, which cut the learning rate by a constant factor every constant number of epochs (i.e., the learning rate decays geometrically), offer significant improvements over any polynomially decaying step size schedule. In particular, the behavior of the final iterate with step decay schedules is off from the statistical minimax rate by only log factors (in the condition number for the strongly convex case, and in T in the non-strongly convex case). Finally, in stark contrast to the known horizon case, this paper shows that the anytime (i.e. the limiting) behavior of SGD's final iterate is poor (in that it queries iterates with highly sub-optimal function value infinitely often, i.e. in a limsup sense) irrespective of the step size scheme employed. These results demonstrate the subtlety in establishing optimal learning rate schedules (for the final iterate) for stochastic gradient procedures in fixed time horizon settings.
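The two families of schedules contrasted in this abstract can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the function names, the default decay factor of 0.5, the phase length, and the polynomial exponent are all assumptions chosen for the example.

```python
def step_decay_lr(step, initial_lr, steps_per_phase, decay_factor=0.5):
    """Geometric (step decay) schedule: multiply the learning rate by
    decay_factor once every steps_per_phase steps.

    All parameter names and defaults are illustrative assumptions.
    """
    phase = step // steps_per_phase  # number of completed phases so far
    return initial_lr * decay_factor ** phase


def polynomial_decay_lr(step, initial_lr, power=0.5):
    """Polynomially decaying schedule: initial_lr / (1 + step) ** power.

    The exponent 0.5 is an assumed example value.
    """
    return initial_lr / (1 + step) ** power


if __name__ == "__main__":
    # Print both schedules at a few steps to compare how they decay.
    for step in (0, 10, 20, 40, 80):
        geometric = step_decay_lr(step, initial_lr=0.1, steps_per_phase=10)
        polynomial = polynomial_decay_lr(step, initial_lr=0.1)
        print(step, geometric, polynomial)
```

The step decay rate stays constant within a phase and then drops by a fixed factor, so it falls geometrically in the number of phases, whereas the polynomial schedule shrinks smoothly at every step.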
Full Text Available
Is a Good Representation Sufficient for Sample Efficient Reinforcement Learning?

Du, Simon S.; Kakade, Sham M.; Wang, Ruosong; Yang, Lin F. (January 2020, International Conference on Learning Representations)

Modern deep learning methods provide effective means to learn good representations. However, is a good representation itself sufficient for sample efficient reinforcement learning? This question has largely been studied only with respect to (worst-case) approximation error, in the more classical approximate dynamic programming literature. With regard to the statistical viewpoint, this question is largely unexplored, and the extant body of literature mainly focuses on conditions which permit sample efficient reinforcement learning with little understanding of what are necessary conditions for efficient reinforcement learning. This work shows that, from the statistical viewpoint, the situation is far subtler than suggested by the more traditional approximation viewpoint, where the requirements on the representation that suffice for sample efficient RL are even more stringent. Our main results provide sharp thresholds for reinforcement learning methods, showing that there are hard limitations on what constitutes good function approximation (in terms of the dimensionality of the representation), where we focus on natural representational conditions relevant to value-based, model-based, and policy-based learning. These lower bounds highlight that having a good (value-based, model-based, or policy-based) representation in and of itself is insufficient for efficient reinforcement learning, unless the quality of this approximation passes certain hard thresholds. Furthermore, our lower bounds also imply exponential separations on the sample complexity between 1) value-based learning with perfect representation and value-based learning with a good-but-not-perfect representation, 2) value-based learning and policy-based learning, 3) policy-based learning and supervised learning, and 4) reinforcement learning and imitation learning.
Full Text Available
